USING RETRIEVAL-AUGMENTED GENERATION TO ELEVATE LOW-CODE DEVELOPER SKILLS


Abstract. This article proposes applying retrieval-augmented generation (RAG) to improve the skills of low-code
developers by augmenting large language models with up-to-date domain-specific knowledge. Because low-code development
requires combining multiple systems into a final product, developers must consult several sources of documentation as
well as various articles, videos, and forum threads. This process can be time-consuming, which prompts developers to ask
an LLM for a single authoritative answer. However, LLMs often lack knowledge of low-code platforms, leading to
hallucinations and superficial responses. RAG grounds an LLM in retrieved, relevant information, which suggests that it
may be applied effectively in low-code development. Heterogeneous data sources concerning low-code systems are converted
to a text representation, split into logical chunks, and stored in a vector database. At query time, cosine similarity
is used to retrieve the top-K documents, which are concatenated with the user query, and the resulting text is used as
the prompt to an LLM. The results support the hypothesis that RAG models outperform standard LLMs in knowledge
retrieval in this domain.
Introduction

Low-code development refers to building software using specialized tools and solutions that streamline the
development process. Such ease is facilitated by pre-built components that combine everyday programming tasks into
distinct entities. These entities can then be extensively tested to ensure the high quality of developed software
across all companies that utilize the low-code paradigm.

However, the providers of low-code tools often leave the customers with little guidance on using those tools.
Admittedly, most low-code platforms host up-to-date documentation that covers the potential questions and describes
the provided tools in great detail. Nevertheless, adopters of low-code solutions frequently encounter stagnation
while solving their problems due to the inevitability of previously uncovered questions arising. There is also a
possibility that the provided knowledge platform is hard to navigate or query for the required information. Finally,
developers that utilize low-code platforms often try to integrate multiple solutions and thus must consult numerous
sources simultaneously.

Perhaps the first resort the developers turn to when confronted with a problem is a search engine like Google.
Unfortunately, even though search engines try to provide the most relevant information, they can overwhelm the user
with the number of links and potential solutions. Additionally, many developing low-code platforms lack extensive
forum threads, articles, and discussions concerning every possible issue that can arise while creating a product.

The next possible solution for a problem is a large language model (LLM) like ChatGPT that uses transformer [1]
architecture. It utilizes the knowledge from the data it has been trained on to give a single answer to a prompt;
thus, there is no longer the oversaturation of proposed solutions. Nevertheless, such models also have their
limitations. First, they are trained on the data only up to a certain point in time, which causes the models to
ignore current events. Moreover, they frequently “hallucinate”, that is, make up the response without backing
evidence. Finally, they provide no sources for their claims even in the event of a correct answer, leaving the
user in doubt of the legitimacy of the LLM’s statements.

A potential solution can be found by 
combining the advantages of natural language 
processing from LLMs, vast domain 
knowledge of structured resources like product 
documentation, and practical search
algorithms. One such approach is retrieval- 
augmented generation (RAG) [2], which aims 
to augment the processing done by a language 
model with external knowledge from a 
document database. 


Related Work 

The concept of RAG [2] has been 
introduced in the context of open-domain 
question answering, abstractive question 
answering, Jeopardy question generation, and 
fact verification. It has been shown that a 
unified architecture can achieve state-of-the- 
art performance across many tasks. 

RAG has been applied in the software 
development process to gain a deeper
understanding of the produced code [3]. Such 
insights were achieved by combining the 
benefits of Graph Neural Networks (GNNs) 
with a novel retrieval-augmented mechanism. 

The applications of RAG in coding were 
extended to code generation via REDCODER 
[4] — a retrieval-augmented framework that 
uses state-of-the-art dense retrieval to provide 
context to a generative model. 

The efficiency of the RAG pipeline has 
been considerably improved using 
Hierarchical Selection and Dense Knowledge 
Retrieval [5], reducing the computation time 
by a factor of 100 in a task-oriented dialog 
system. 

Several methods have been proposed to 
adapt RAG to heterogeneous knowledge [6]. 
One such approach is homogenizing 
knowledge from different domains to unified 
unstructured text. We can also utilize graph 
databases to facilitate multi-hop reasoning 
over heterogeneous structured sources. 

Another promising modification is a 
forward-looking active retrieval-augmented 
generation (FLARE) [7], which retrieves the 
documents based not only on the user’s input 
but also on the model’s prediction of the 
following statement and regenerates the 
sentence in the event of low-confidence
tokens. 

It has also been demonstrated that it is 
not necessary to modify the model architecture 
to incorporate external knowledge for the RAG 
process [8]. A simple concatenation of the 
retrieved documents to the input (in-context 
model) produces satisfactory results in 
language modeling and open-domain question 
answering. 

The following metrics have been 
suggested to evaluate the performance of 
different LLMs in RAG: noise robustness, 
negative rejection, information integration, 
and counterfactual robustness [9]. It was 
shown that the introduction of RAG has posed 
several challenges, even for state-of-the-art 
LLMs. 

The advantages of using an external 
knowledge source in RAG models and storing 
knowledge in parametric models were 
combined into a key-value memory [10] to 
improve the accuracy and execution time of 
question answering. 

The widespread interest in RAG has led 
to dedicated discussions on recent
developments in this area [11]. They attract an 
audience interested in natural language 
generation and information retrieval, covering 
the topics of dialogue response generation, 
machine translation, text style transfer, etc. 

RAG has been utilized in multimodal 
models, resulting in a Multimodal Retrieval- 
Augmented Transformer (MuRAG), which 
employs external non-parametric memory to 
improve language generation [12]. MuRAG 
has achieved state-of-the-art accuracy on 
WebQA and Multimodal QA. 

Other sources consider using RAG in 
multimodal models as a way to integrate 
knowledge in a more scalable and modular 
way [13]. It has been shown that the resulting 
model outperforms DALL-E and CM3 on 
image and caption generation tasks while 
requiring much less compute resources for 
training. 

Particular attention has been given to 
constructing a valuable representation of the 
documents in the database. One such structure 
is a knowledge graph, which can be built using 
the accomplishments in a related task — slot 
filling [14]. An approach called KGI has been 
shown to improve dense passage retrieval in
the KILT benchmark. 


Suggested Method 

Let us examine the RAG model in 
greater detail. The process of RAG [2] is split 
into multiple steps. First, a query is received, 
and the model retrieves the best matches from 
the database using dense passage retrieval 
(DPR) [15]. Next, each match is used as 
context to an LLM [16], which produces a 
distribution for every output token. Finally, the
distributions are marginalized based on the
contribution of each document. This process is
illustrated in Fig. 1.
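
The marginalization step can be made concrete with a small sketch. It assumes the per-token form of marginalization from the original RAG formulation [2]: each retrieved document contributes its next-token distribution, weighted by the retriever's probability for that document. The function name, the toy tokens, and the numbers are illustrative assumptions, not values from this article.

```python
from typing import Dict, List


def marginalize(doc_probs: List[float],
                token_dists: List[Dict[str, float]]) -> Dict[str, float]:
    """Combine per-document next-token distributions into one distribution.

    doc_probs[i]   -- retriever weight of document i (weights should sum to 1)
    token_dists[i] -- generator's next-token distribution given document i
    """
    if len(doc_probs) != len(token_dists):
        raise ValueError("one token distribution is required per document")
    marginal: Dict[str, float] = {}
    for weight, dist in zip(doc_probs, token_dists):
        for token, prob in dist.items():
            # Each document contributes in proportion to its retrieval weight.
            marginal[token] = marginal.get(token, 0.0) + weight * prob
    return marginal


# Toy example (made-up numbers): two retrieved documents, two candidate tokens.
doc_probs = [0.7, 0.3]
token_dists = [
    {"v2": 0.9, "v1": 0.1},  # distribution conditioned on document 1
    {"v2": 0.2, "v1": 0.8},  # distribution conditioned on document 2
]
result = marginalize(doc_probs, token_dists)
print(result)
```

In this toy case the more heavily weighted document dominates, so the token it favors receives the larger marginal probability.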

RAG can provide low-code developers 
with relevant, up-to-date, evidence-driven 
responses. Consider the following example 
that demonstrates the advantages of RAG. 

A developer might need to combine two 
low-code platforms, Caspio (a database 
solution) and Power Automate (an automation 
solution). A potential question may be: “How 
do you get Caspio tables in Power Automate?” 
At the time of writing, posing this question to
gpt-3.5-turbo produces a response with
numerous flaws.


[Figure: RAG pipeline diagram; surviving labels: database, marginalization, result]


Fig. 1. Illustration of RAG algorithm


For instance, the model provides an
outdated Caspio API endpoint (“The base URL
usually follows the format:
"https://caspio.com/rest/v1/tables/{table_name}"”),
does not elaborate on specific steps to
get a Caspio API key (“You may need to
include an API key or credentials in the
headers to authenticate.”), and fails to mention
detailed actions to create a corresponding flow
in Power Automate (“Open the Power
Automate platform and create a new flow”).

To eliminate the shortcomings of the 
standard model, the author proposes to 
construct a RAG model in the following way. 

First, a retrieval structure is chosen. 
Word embeddings [17] provide the advantage 
of quick queries to find phrases with similar 
meanings, so that is the approach taken in this 
article. OpenAI Embeddings API
provides text-embedding-ada-002 for creating
the embeddings, while Pinecone API stores
them in a database for subsequent querying.

Second, a generation model is
established. OpenAI Chat Completion API 
gives access to state-of-the-art gpt-3.5-turbo, 
justifying its selection for this task. 

Third, official Caspio and Power 
Automate documentation data is downloaded 
and split into chunks, each representing a 
logical piece of information. An embedding 
algorithm processes the data segments, and the 
resulting embeddings are inserted into a vector 
database. One segment is depicted in Fig. 2. 
Score represents the cosine similarity between 
the queried and the stored vector on a scale 
from -1 if vectors are pointing in opposite 
directions to 1 when vectors are pointing in the 
same direction. The vector is in an
n-dimensional space and is represented by
values. Text is a key in the metadata
dictionary.
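
The chunking and storage step can be sketched as follows. This is a minimal illustration under stated assumptions: sections are assumed to start with a line beginning with "# ", the chunk text follows the "System: ... / Topic: ..." layout visible in Fig. 2, and `toy_embed` is a hypothetical stand-in that derives a small deterministic vector from a hash (it is not the embedding model used in this article). The record layout (id, values, metadata with a text key) mirrors the fields described above; the way the id is derived is also an assumption.

```python
import hashlib
from typing import Dict, List


def toy_embed(text: str, dim: int = 8) -> List[float]:
    """Hypothetical embedding stand-in: a deterministic vector from a hash."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [(byte - 127.5) / 127.5 for byte in digest[:dim]]


def format_chunk(system: str, topic: str, body_lines: List[str]) -> str:
    """Render one logical chunk in the 'System / Topic / body' layout."""
    body = "\n".join(body_lines).strip()
    return f"System: {system}\nTopic: {topic}\n\n{body}"


def split_into_chunks(system: str, document: str) -> List[str]:
    """Split a document into chunks, one per section starting with '# '."""
    chunks: List[str] = []
    topic = None
    body: List[str] = []
    for line in document.splitlines():
        if line.startswith("# "):
            if topic is not None:
                chunks.append(format_chunk(system, topic, body))
            topic, body = line[2:].strip(), []
        elif topic is not None:
            body.append(line)
    if topic is not None:
        chunks.append(format_chunk(system, topic, body))
    return chunks


def to_record(chunk: str) -> Dict:
    """Build a vector-store record: id, vector values, and text metadata."""
    return {
        "id": hashlib.sha256(chunk.encode("utf-8")).hexdigest()[:16],
        "values": toy_embed(chunk),
        "metadata": {"text": chunk},
    }


# The second section is placeholder text, included only to show the splitting.
sample_doc = """# Caspio Get Table Description
GET /v2/tables/{tableName}

# Placeholder Topic
Placeholder body text for a second chunk.
"""
records = [to_record(c) for c in split_into_chunks("Caspio", sample_doc)]
print(records[0]["metadata"]["text"])
```

In a real pipeline the records would be sent to the vector database instead of being kept in a list, and `toy_embed` would be replaced by a call to the chosen embedding model.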


[Screenshot of the Pinecone browser; recoverable fields:]

ID: 176938034487450024
SCORE: -0.0349
VALUES: -0.0108384499, 0.0220762...
METADATA: text: "System: Caspio\nTopic: Caspio Get Table Description\n\nGET /v2/tables/{tableName}\n\...


Fig. 2. An example of a typical vector stored in Pinecone 


When a query is received, its embedding is computed, and the cosine similarity [17] between 
the user’s question vector and each database vector is calculated using the following formula: 


$$\text{cosine similarity} = \cos\theta = \frac{A \cdot B}{\|A\|\,\|B\|}$$


An example of two such vectors is presented in Fig. 3. 


Fig. 3. An illustration of two vectors, denoted as A and B, with an angle θ included between them
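
The retrieval and prompt-assembly steps of this section can be sketched in a few lines. This is a minimal in-memory illustration, not the Pinecone query itself: the record layout, the function names, and the two-dimensional toy vectors are assumptions, while the default of five results and the prompt template follow the description in this article.

```python
import math
from typing import Dict, List, Tuple


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """cos(theta) = (A . B) / (||A|| ||B||), ranging from -1 to 1."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def top_k(query_vec: List[float], records: List[Dict],
          k: int = 5) -> List[Tuple[float, str]]:
    """Score every stored vector against the query and keep the k best."""
    scored = [
        (cosine_similarity(query_vec, r["values"]), r["metadata"]["text"])
        for r in records
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(context_texts: List[str], query: str) -> str:
    """Concatenate the retrieved texts with the query using the template."""
    context = "\n\n".join(context_texts)
    return (f"Use all the information: {context} "
            f"and answer this user's query in detail: {query}.")


# Toy two-dimensional vectors standing in for real embeddings.
records = [
    {"values": [0.0, 1.0], "metadata": {"text": "doc B"}},
    {"values": [1.0, 0.0], "metadata": {"text": "doc A"}},
    {"values": [-1.0, 0.0], "metadata": {"text": "doc C"}},
]
query = "How do you get Caspio tables in Power Automate?"
matches = top_k([1.0, 0.0], records, k=2)
prompt = build_prompt([text for _, text in matches], query)
print(matches)
print(prompt)
```

A vector database performs the same ranking with an index instead of a full scan, which is what makes the lookup fast on large document collections.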


The top five scoring vectors are returned as containing the most relevant text. These vectors form the context
for the subsequent input to an LLM in the form of “Use all the information: {context} and answer this user’s
query in detail: {query}.” The LLM generates the corresponding response.


Results

The resulting output of the model is up-to-date, relevant, and detailed, providing sufficient information to
answer the user’s query. In the example considered, the RAG model:

• elaborates on the necessary actions to get a Caspio API token (“Here are the steps: ... This token is generated
using the Client ID/Secret pair that can be found on your Caspio REST API profile page...
Body:grant_type=client_credentials&client_id=<Your Client ID>&client_secret=<Your_Client_Secret>...”);
• provides the correct endpoint (“/v2/tables/{tableName}”);
• gives advice on handling the response in Power Automate (“Parse the response: The response will be in JSON
format...”).


Conclusion

In summary, using RAG has demonstrated considerable improvement in
output compared to a standard LLM. Such a
solution will allow low-code developers to 
solve the arising issues during product 
development and free their time for further 
iteration. Although this work was conducted 
with low-code developers in mind, the results 
can apply to people without a technical
background or professionals who use
proprietary knowledge systems and require 
effortless search and summarization 
capabilities. The model may be enhanced by 
using a different way of storing and retrieving 
information, perhaps in a graph database [18], 
on which further research efforts will be 
focused. 